334 research outputs found
PhD-SNPg: a webserver and lightweight tool for scoring single nucleotide variants
One of the major challenges in human genetics is to identify the functional effects of coding and non-coding single nucleotide variants (SNVs). In the past, several methods have been developed to identify disease-related single amino acid changes, but only a few tools are able to score the impact of non-coding variants. Among the most popular algorithms, CADD and FATHMM predict the effect of SNVs in non-coding regions by combining sequence conservation with several functional features derived from the ENCODE project data. Thus, running CADD or FATHMM locally requires downloading a large set of pre-calculated information during installation. To facilitate the process of variant annotation, we developed PhD-SNPg, a new easy-to-install and lightweight machine learning method that depends only on sequence-based features. Despite this, PhD-SNPg performs similarly to or better than more complex methods. This makes PhD-SNPg ideal for quick SNV interpretation and as a benchmark for tool development.
Network measures for protein folding state discrimination
Proteins fold through two-state or multi-state kinetic mechanisms, but so far there is no first-principles model that explains this difference in behavior. We exploit the network properties of protein structures by introducing novel observables to address the problem of classifying the different types of folding kinetics. These observables have a plain physical meaning in terms of vibrational modes, possible configurations compatible with the native protein structure, and folding cooperativity. The relevance of these observables is supported by a classification performance of up to 90%, even with simple classifiers such as discriminant analysis.
Synergistic toxicity of some sulfonamide mixtures on Daphnia magna
In livestock farming, sulfonamides (SAs) are used prophylactically and simultaneously in large numbers of animals. Therefore, traces of these compounds, alone or in combination, have been repeatedly detected in the environment. Synergistic interactions among chemicals in such mixtures represent an area of concern for the regulatory authorities. In this study, the acute toxic effects of binary and ternary mixtures of SAs were evaluated in Daphnia magna, in order to verify whether they jointly exert a larger effect than would be predicted from their individual toxicities alone. First, following the Concentration Addition (CA) principle, some preliminary observations were made by testing a number of drug combinations with an expected 50% effect. Then, mixtures more recognised for their synergistic effect (four binary and two ternary) were assayed over a range of decreasing concentrations. The data acquired were processed using CompuSyn software, which takes into account the different shapes of the curves obtained when calculating the Combination Index (CI) for the evaluation of synergistic effects. For binary mixtures, synergy was also evaluated using the curvilinear isobologram method for heterodynamic drugs. Results indicate that most of the selected mixtures exhibit a synergistic effect according to the CI methodology. For binary mixtures, these findings were also confirmed by isobologram analysis. The detected synergies indicate that CA is not always precautionary as a reference model for the evaluation of the aquatic toxicity of SA mixtures.
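As an illustration of the Combination Index mentioned in the abstract above, the following minimal sketch uses the standard Chou-Talalay formulation, in which each compound's single-agent dose for a given effect comes from the median-effect equation. It is not the CompuSyn implementation, and the function names, EC50 values, slopes, and mixture doses are hypothetical placeholders rather than data from the study.

```python
def dose_for_effect(dm, m, fa):
    """Median-effect equation: dose of a single compound that alone
    produces the affected fraction fa, given its median-effect dose dm
    (e.g. EC50) and the slope m of its dose-effect curve."""
    return dm * (fa / (1.0 - fa)) ** (1.0 / m)


def combination_index(doses, dms, slopes, fa):
    """Chou-Talalay Combination Index for a mixture producing effect fa.

    doses  : dose of each compound in the mixture
    dms    : median-effect dose of each compound tested alone
    slopes : slope of each compound's single-agent curve
    CI < 1 suggests synergy, CI = 1 additivity, CI > 1 antagonism.
    """
    return sum(
        d / dose_for_effect(dm, m, fa)
        for d, dm, m in zip(doses, dms, slopes)
    )


# Hypothetical binary mixture (illustrative numbers only).
ec50 = [10.0, 40.0]      # assumed single-compound EC50 values
slopes = [1.5, 2.0]      # assumed curve slopes
mixture = [3.0, 10.0]    # assumed doses that together give a 50% effect

ci = combination_index(mixture, ec50, slopes, fa=0.5)
print(f"CI = {ci:.2f}")
```

At a 50% effect the median-effect dose equals the EC50 regardless of slope, so the index reduces to the sum of toxic units that the Concentration Addition model expects to equal 1.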
The posterior-Viterbi: a new decoding algorithm for hidden Markov models
Background: Hidden Markov models (HMM) are powerful machine learning tools
successfully applied to problems of computational Molecular Biology. In a
predictive task, the HMM is endowed with a decoding algorithm in order to
assign the most probable state path, and in turn the class labeling, to an
unknown sequence. The Viterbi and the posterior decoding algorithms are the
most common. The former is very efficient when one path dominates, while the
latter, even though it does not guarantee to preserve the automaton grammar, is
more effective when several concurring paths have similar probabilities. A
third good alternative is 1-best, which was shown to perform as well as or better
than Viterbi. Results: In this paper we introduce the posterior-Viterbi (PV), a
new decoding algorithm which combines the posterior and Viterbi algorithms. PV is a two-step
process: first the posterior probability of each state is computed and
then the best posterior allowed path through the model is evaluated by a
Viterbi algorithm.
Conclusions: We show that PV decoding performs better than other algorithms
first on toy models and then on the computational biological problem of the
prediction of the topology of beta-barrel membrane proteins.
Comment: 23 pages, 3 figures
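The two-step procedure described in the abstract above can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes a plain HMM given as initial, transition, and emission probability lists, treats a transition as "allowed" when its probability is greater than zero, and scores a path by the product of its per-position posterior probabilities. The function names and the toy model are hypothetical.

```python
import math


def posterior_probs(obs, init, trans, emit):
    """Step 1: scaled forward-backward; returns P(state at t = k | obs)."""
    n, T = len(init), len(obs)
    fwd = [[0.0] * n for _ in range(T)]
    bwd = [[1.0] * n for _ in range(T)]
    scale = [1.0] * T
    for t in range(T):
        for j in range(n):
            prev = init[j] if t == 0 else sum(
                fwd[t - 1][i] * trans[i][j] for i in range(n))
            fwd[t][j] = prev * emit[j][obs[t]]
        scale[t] = sum(fwd[t])                 # rescale to avoid underflow
        fwd[t] = [v / scale[t] for v in fwd[t]]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            bwd[t][i] = sum(
                trans[i][j] * emit[j][obs[t + 1]] * bwd[t + 1][j]
                for j in range(n)) / scale[t + 1]
    return [[fwd[t][k] * bwd[t][k] for k in range(n)] for t in range(T)]


def posterior_viterbi(obs, init, trans, emit):
    """Step 2: Viterbi-style search over allowed paths, scored by posteriors."""
    post = posterior_probs(obs, init, trans, emit)
    n, T = len(init), len(obs)
    neg_inf = float("-inf")

    def log(p):
        return math.log(p) if p > 0 else neg_inf

    score = [[neg_inf] * n for _ in range(T)]
    back = [[0] * n for _ in range(T)]
    for k in range(n):
        if init[k] > 0:                        # only allowed start states
            score[0][k] = log(post[0][k])
    for t in range(1, T):
        for j in range(n):
            best_i, best = 0, neg_inf
            for i in range(n):
                # Transitions with zero probability are never followed.
                if trans[i][j] > 0 and score[t - 1][i] > best:
                    best_i, best = i, score[t - 1][i]
            score[t][j] = best + log(post[t][j])
            back[t][j] = best_i
    last = max(range(n), key=lambda k: score[T - 1][k])
    path = [last]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]


# Toy model (hypothetical): state 1 can never return to state 0.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.0, 1.0]]
emit = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 0, 1, 0, 1, 1]

post = posterior_probs(obs, init, trans, emit)
path = posterior_viterbi(obs, init, trans, emit)
print(path)
```

Unlike plain posterior decoding, which picks the most probable state at each position independently, the path returned here only uses transitions the model permits, so it respects the automaton grammar.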
Embedding machine-readable proteins interactions data in scientific articles for easy access and retrieval
Extraction of protein-protein interaction data from scientific literature remains a hard, time- and resource-consuming task. This task would be greatly simplified by embedding in the source, i.e. research articles, a standardized, synthetic, machine-readable codification for describing protein-protein interaction data, making the identification and retrieval of this valuable information easier, faster, and more reliable than it is now.
We briefly discuss how this information can be easily encoded and embedded in research papers with the collaboration of authors and scientific publishers, and propose an online demonstration tool that shows how authors can quickly and easily convert such biological data into an embeddable, accessible, computer-readable codification.
PhD-SNPg: updating a webserver and lightweight tool for scoring nucleotide variants
One of the primary challenges in human genetics is determining the functional impact of single nucleotide variants (SNVs) and insertions and deletions (InDels), whether coding or noncoding. In the past, methods have been created to detect disease-related single amino acid changes, but only some can assess the influence of noncoding variants. CADD is the most commonly used and advanced algorithm for predicting the diverse effects of genome variations. It employs a combination of sequence conservation and functional features derived from the ENCODE project data. To use CADD, a large set of pre-calculated information must be downloaded during the installation process. To streamline the variant annotation process, we developed PhD-SNPg, a machine-learning tool that is easy to install and lightweight, relying solely on sequence-based features. Here we present an updated version, trained on a larger dataset, that can also predict the impact of InDel variants. Despite its simplicity, PhD-SNPg performs similarly to CADD, making it ideal for rapid genome interpretation and as a benchmark for tool development.
SChloro: directing Viridiplantae proteins to six chloroplastic sub-compartments
Motivation: Chloroplasts are organelles found in plants and involved in several important cell processes. Like other compartments in the cell, chloroplasts have an internal structure comprising several sub-compartments, to which different proteins are targeted to perform their functions. Given the relation between protein function and localization, the availability of effective computational tools to predict protein sub-organelle localization is crucial for large-scale functional studies.
Results: In this paper we present SChloro, a novel machine-learning approach to predict protein sub-chloroplastic localization, based on targeting signal detection and membrane protein information. The proposed approach performs multi-label predictions discriminating six chloroplastic sub-compartments: inner membrane, outer membrane, stroma, thylakoid lumen, plastoglobule and thylakoid membrane. In comparative benchmarks, the proposed method outperforms current state-of-the-art methods in both single- and multi-compartment predictions, with an overall multi-label accuracy of 74%. The results demonstrate the relevance of the approach, which is a good candidate for integration into more general large-scale annotation pipelines of protein subcellular localization.
NET-GE: a novel NETwork-based Gene Enrichment for detecting biological processes associated to Mendelian diseases
Enrichment analysis is a widely applied procedure for shedding light on the molecular mechanisms and functions underlying phenotypes, for enlarging the dataset of possibly related genes/proteins, and for helping the interpretation and prioritization of newly determined variations. Several standard and network-based enrichment methods are available. Both approaches rely on the annotations that characterize the genes/proteins included in the input set; network-based ones also include, in different ways, the physical and functional relationships among different genes or proteins that can be extracted from the available biological interaction networks.
In silico evidence of the relationship between miRNAs and siRNAs
Both short interfering RNAs (siRNAs) and microRNAs (miRNAs) mediate the
repression of specific sequences of mRNA through the RNA interference pathway.
In recent years several experiments have supported the hypothesis that siRNAs
and miRNAs may be functionally interchangeable, at least in cultured cells. In
this work we verify that this hypothesis is also supported by computational
evidence. We show that a method specifically trained to predict the activity of
exogenous siRNAs assigns a high silencing level to experimentally
determined human miRNAs. This result not only supports the idea of siRNA and
miRNA equivalence but also indicates that it is possible to use computational tools
developed using synthetic small interfering RNAs to investigate endogenous
miRNAs.
Comment: 8 pages, 2 figures